Nucleic acid methylation detection process using an internal reference sample

ABSTRACT

There is disclosed a process for detection of DNA methylation at CpG sites using nucleic acid arrays and preferably microarrays. Specifically, there is disclosed a process for directly generating a reference sample from the sample to be tested and detecting methylation at large numbers of CpG island sites simultaneously. More specifically, the inventive process comprises dividing a DNA sample into two samples (a first sample and a second sample), amplifying the first DNA sample by a nucleic acid amplification process such that any methylcytosine residues are amplified as unmethylated cytosine residues, treating the amplified first sample and the (unamplified) second sample with bisulfite to convert unmethylated cytosine residues in both samples to deoxyuracil residues, labeling the bisulfite-converted second sample with a second fluorescent marker and the bisulfite-converted first sample with a first fluorescent marker, wherein the first and second fluorescent markers have non-overlapping fluorescent excitation and emission spectra; and hybridizing the first sample and the second sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite.

TECHNICAL FIELD

The present invention provides a process for detection of DNA methylation at CpG sites using nucleic acid arrays and preferably microarrays. Specifically, the present invention provides a process for directly generating a reference sample from the sample to be tested and detecting methylation at large numbers of CpG island sites simultaneously. More specifically, the inventive process comprises dividing a DNA sample into two samples (a first sample and a second sample), amplifying the first DNA sample by a nucleic acid amplification process such that any methylcytosine residues are amplified as unmethylated cytosine residues, treating the amplified first sample and the (unamplified) second sample with bisulfite to convert unmethylated cytosine residues in both samples to deoxyuracil residues, labeling the bisulfite-converted second sample with a second fluorescent marker and the bisulfite-converted first sample with a first fluorescent marker, wherein the first and second fluorescent markers have non-overlapping fluorescent excitation and emission spectra; and hybridizing the first sample and the second sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite.

BACKGROUND ART

Methylation Assay Processes

Methylation of cytosines (C) at the 5 position of the pyrimidine ring has been shown to be an important epigenetic determinant of whether a cell or tissue sample is cancerous. In animals, methylcytosine is mainly found in cytosine-guanine (CpG) dinucleotides, whereas in plants it is most often found in cytosine-any base-guanine (CpNpG) trinucleotide sequences.

Methylation of C residues in genomic DNA plays a key role in regulation of gene expression (Wolffe et al., Proc. Natl. Acad. Sci. USA 96:5894-5896, 1999) because the presence of 5-methylcytosine in the promoter of specific genes alters the binding of transcriptional factors and other promoters to DNA (Costello and Plass, J. Med. Genet. 38:285-503, 2001). Further, 5-methylcytosine in the promoter of specific genes also attracts methyl-DNA binding proteins and histone deacetylases that modify chromatin structure around the gene transcription site. Both effects result in blocking transcription and cause gene silencing (Bird, Nature 321:209-213, 1986).

Generally, levels of methylcytosine occurrence in genomic DNA have been measured using two different general processes: processes employing high-performance separation techniques and processes employing enzymatic/chemical means. In order to perfect large scale screening techniques, the enzymatic/chemical means are preferred because they do not require expensive and complex analytical equipment. However, the enzymatic/chemical techniques have not been as sensitive as high-performance separation techniques and the resolution is often restricted to endonuclease cleavage sites.

Two alternative approaches have been tried for DNA methylation detection: bisulfite methods and non-bisulfite methods. Non-bisulfite methods use methylation-sensitive restriction endonucleases combined with Southern blot analysis or PCR detection, but results are often limited to cleavage sites. Bisulfite modification of DNA allows for quantitative determination of methylation status of an allele and requires PCR amplification of bisulfite-modified DNA. Differences in methylcytosine patterns are displayed by methylation-dependent primer designs (i.e., methylation-specific PCR) in conjunction with methylation-sensitive restriction endonucleases, genomic sequencing or other approaches.

Bisulfite treatment of DNA converts unmethylated cytosine to uracil, while methylated cytosine does not react (Furuichi et al., Biochem. Biophys. Res. Commun. 41:1185-1191, 1970). Bisulfite modification of genomic DNA requires prior DNA denaturation because only cytosines that are located in single strands are susceptible to attack (Shapiro et al., J. Am. Chem. Soc. 96:206-212, 1974). However, there are problems associated with bisulfite treatment, including, for example, only partial denaturation (Rein et al., J. Biol. Chem. 272:10021-10029, 1997), renaturation problems in high salt concentrations, and incomplete desulfonation after bisulfite treatment (Thomassin et al., Methods 19:465-475, 1999). Moreover, the total conversion of cytosines to uracils is critical to the analysis, so temperature, time and pH conditions must be chosen to achieve it without destroying the integrity of the DNA material.

In bisulfite modification methylation detection processes, the most straightforward way of measuring methylation at CpG islands is by sequencing. However, sequencing techniques are also the most difficult (time consuming and expensive) and do not allow for multiplexing of large numbers of scattered CpG island sites in genomic DNA samples. In general, after denaturation and bisulfite modification of a genomic DNA sample, the resulting dsDNA is obtained by primer extension and the fragment of interest is amplified by PCR techniques (Clark et al., Nucl. Acids Res. 22:2990-2997, 1994). Standard DNA sequencing of the PCR products then detects methylcytosine. Alternatively, one could clone the PCR products into plasmid vectors followed by sequencing of individual clones, a slower method but one that could also provide methylation maps of single DNA molecules. In another variation, direct localization of methylcytosines in the product of bisulfite treatment instead of the PCR product can be done using only three deoxynucleotides (dATP, dCTP and dTTP) but lacking dGTP, which produces an elongation stop at methylcytosine points (Radlinska and Skowronek, Acta Microbiol. Pol. 47:327-334, 1998).

Another process in the bisulfite class is methylation-specific PCR (Esteller et al., Cancer Res. 61:3225-3229, 2001; and Herman et al., Proc. Natl. Acad. Sci. USA 93:9821-9826, 1996), also called MSP. In normal (non-cancerous) cells, cytosines in CpG islands are usually unmethylated, but they become methylated in the promoter sequences of genes associated with certain abnormal cellular processes, such as cancer (Esteller et al., Cancer Res. 59:793-797, 1999; Esteller et al., Cancer Res. 61:3225-3229, 2001; and Esteller et al., Hum. Mol. Genet. 10:3001-3007, 2001). Bisulfite-converted DNA strands are no longer complementary, so primer design in MSP is customized for each chain and methylation patterns of all sequences are determined in separate reactions. MSP uses a difficult PCR process and critical primer designs: a narrow range of strand annealing temperatures is used, the PCR product is between 80 and 175 base pairs, each primer should contain at least two CpG pairs, the sense primer should contain a CpG pair at the 3′ end, and the primers contain non-CpG cytosines. The MSP technique requires PCR, and if the PCR goes for too many cycles of amplification without ensuring that the reaction is in the linear response range with respect to template concentration, then large over-estimations of the extent of methylation can be obtained if the sequence is amplifiable with both the methylation-specific primers and the primers for unmethylated sequences.

The MSP method was improved by combining methylation-specific PCR with in situ hybridization (Nuovo et al., Proc. Natl. Acad. Sci. USA 96:12754-12759, 1999) to allow for the methylation status of specific DNA sequences to be visualized in individual cells, for monitoring complex tissue samples having both tumor and normal cells. Another method combines MSP with denaturing HPLC to allow for small cell mosaics of structurally normal or abnormal chromosomes to be detected (Baumer et al., Hum. Mutat. 17:423-430, 2001). Specifically, following PCR amplification, the two alleles can be resolved from the two populations of PCR products by denaturing HPLC because they differ at several positions within the amplified sequence.

Another quantification approach has been called MethyLight and uses fluorescent-based, real-time PCR (U.S. Pat. No. 6,331,393 the disclosure of which is incorporated by reference herein; and Eads et al., Nucleic Acids Res. 28:E32, 2000). The DNA is modified by the bisulfite treatment and amplified by fluorescence-based, real-time quantitative PCR using locus-specific PCR primers that flank an oligonucleotide probe with a 5′ fluorescence reporter dye and a 3′ quencher dye. The reporter is enzymatically released during the reaction, and fluorescence, which is proportional to the amount of PCR product and thus to the degree of methylation, can be sequentially detected in an automated nucleotide sequencer device. While fluorescence increases the sensitivity of this process, the process is difficult, requires expensive instrumentation and consumables and cannot be multiplexed to detect hundreds or thousands of CpG island sites simultaneously.

Another approach has been to combine methyl-sensitive endonucleases with PCR amplification with subsequent hybridization to oligonucleotide microarrays (Huang et al., Hum. Mol. Genetics, 8:459-70, 1999). In this case, methylation state was determined by digestion of unmethylated DNA using a methylation-sensitive restriction enzyme. Unmethylated DNA was enzymatically digested into fragments and did not generate amplicons after PCR whereas methylated DNA was protected from digestion and did generate amplicons after PCR. The presence or absence of amplicons was detected on oligonucleotide microarrays using fluorescent tags. Samples from normal tissues were used as a control with the supposition that these non-cancerous samples contained predominantly unmethylated cytosine residues. This procedure requires DNA from non-cancerous tissue to be available for use as an external control. Additionally, the exact methylation state of the external control needs to be ascertained before it can be confidently used to interpret results from a dual-hybridization assay.

Another approach has been to perform a dual-hybridization assay using a test sample and an external reference sample known to be unmethylated in the analyzed region (Balog et al., Anal Biochem. 309: 301-310, 2002). In this case, a 190-bp DNA duplex was synthesized and used as an external reference sample, or DNA was obtained from a sample known to be unmethylated. The two samples were labeled with different fluorescent dyes, mixed and hybridized to an array containing 21-mer oligonucleotides. The external reference sample generated signal in a reference fluorescent channel on capture probes hybridizing to a thymidine residue. The presence of signal on a capture molecule probing for the presence of C within the test sample indicated methylation of that C residue.

Thus, there are a variety of methylation detection processes that have advantages and disadvantages, but none have the ability to determine the methylation state of a large number of CpG islands without the presence of an external reference sample. Accordingly, there is a need in the art to incorporate processes that do not require an external reference sample yet are able to multiplex DNA methylation assays to simultaneously determine methylation patterns.

DNA Microarrays

In the world of microarrays or biochips, biological molecules (e.g., oligonucleotides, polypeptides, oligopeptides and the like) are placed onto surfaces at defined locations for potential binding with target samples of nucleotides or receptors or other molecules. Microarrays are miniaturized arrays of biomolecules available or being developed on a variety of platforms. Much of the initial focus for these microarrays has been in genomics with an emphasis on cellular gene expression, single nucleotide polymorphisms (SNPs) and genomic DNA detection/validation, functional genomics and proteomics (Wilgenbus and Lichter, J. Mol. Med. 77:761, 1999; Ashfari et al., Cancer Res. 59:4759, 1999; Kurian et al., J. Pathol. 187:267, 1999; Hacia, Nature Genetics 21 suppl.:42, 1999; Hacia et al., Mol. Psychiatry 3:483, 1998; and Johnson, Curr. Biol. 26:R171, 1998).

There are, in general, three categories of microarrays (also called “DNA Arrays” and “Gene Chips”, although an attempt has been made to trademark the latter descriptive name) having oligonucleotide content. Most often, the oligonucleotide microarrays have a solid surface, usually silicon-based and most often a glass microscope slide. Oligonucleotide microarrays are often made by different techniques, including (1) “spotting” by depositing single nucleotides for in situ synthesis or completed oligonucleotides by physical means (ink jet printing and the like), (2) photolithographic techniques for in situ oligonucleotide synthesis (see, for example, Fodor U.S. Pat. No. 5,445,934 and the additional patents that claim priority from this priority document), (3) electrochemical in situ synthesis based upon pH based removal of blocking chemical functional groups (see, for example, Montgomery U.S. Pat. No. 6,093,302 the disclosure of which is incorporated by reference herein and Southern U.S. Pat. No. 5,667,667), and (4) electric field attraction/repulsion of fully-formed oligonucleotides (see, for example, Hollis et al., U.S. Pat. No. 5,653,939 and its duplicate Heller U.S. Pat. No. 5,929,208). Only the first three basic techniques can form oligonucleotides in situ, that is, by building each oligonucleotide, nucleotide-by-nucleotide, on the microarray surface without placing or attracting fully formed oligonucleotides.

The electrochemistry platform (Montgomery U.S. Pat. No. 6,093,302, the disclosure of which is incorporated by reference herein) provides a microarray based upon a semiconductor chip platform having a plurality of microelectrodes. This chip design uses Complementary Metal Oxide Semiconductor (CMOS) technology to create high-density arrays of microelectrodes with parallel addressing for selecting and controlling individual microelectrodes within the array. The electrodes turned on with current flow generate electrochemical reagents (particularly acidic protons) to alter the pH in a small, defined “virtual flask” region or volume adjacent to the electrode. The microarray is coated with a porous matrix for a reaction layer material. Thickness and porosity of the material are carefully controlled and biomolecules are synthesized within volumes of the porous matrix whose pH has been altered through controlled diffusion of protons generated electrochemically and whose diffusion is limited by diffusion coefficients and the buffering capacities of solutions.

The microarrays that are made with oligonucleotide capture probes are generally spotted onto glass slides. However, the glass slides are not well suited for creating a reaction chamber with the capture probes that form the spots as the hybridization reaction of target nucleic acids with the capture probes is long and involves controlled conditions. Therefore, there is a need in the art to create better reaction chambers that allow for control of hybridization conditions including stringency conditions (e.g., temperature, gas pressures, chemical environment and pH).

DISCLOSURE OF THE INVENTION

In view of the many processes that have advantages and drawbacks for quantitative methylation determination, there is a need in the art for being able to multiplex many different sites or CpG islands for methylation analysis simultaneously and in parallel, preferably using existing DNA microarray technology. The present invention was made to develop a methylation process adapted to DNA microarrays to take advantage of the multiplex capabilities of DNA microarrays for methylation analysis.

The present invention provides a process for detecting methylation at large numbers of CpG island sites simultaneously using a reference sample obtained from the sample to be tested, comprising:

(a) providing a sample of DNA for analysis;

(b) dividing the DNA sample into a first DNA sample and a second DNA sample, whereby the first sample will become a test sample and the second sample will become an internal reference sample;

(c) amplifying the second DNA sample by a nucleic acid amplification process such that methylcytosine residues are amplified as unmethylated cytosine residues;

(d) bisulfite converting the first DNA sample and the amplified second DNA sample to convert unmethylated cytosine residues to deoxyuracil residues in both samples;

(e) amplifying the converted first DNA sample and the converted second DNA sample;

(f) labeling the bisulfite-converted second DNA sample with a second fluorescent marker and the bisulfite-converted first DNA sample with a first fluorescent marker, wherein the first and second fluorescent markers have non-overlapping fluorescent excitation and emission spectra; and

(g) hybridizing the first DNA sample and the second DNA sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite.

Preferably, the amplification technique employed is PCR (polymerase chain reaction). Preferably, the hybridization conditions are high stringency. Preferably, the non-overlapping fluorescent labels are Cy3 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindocarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) and Cy5 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindodicarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester); or the non-overlapping fluorescent labels are Alexa Fluor 594 and Alexa Fluor 546; or the non-overlapping fluorescent labels are Fluorescein and Texas Red.

The present invention also provides a microarray device for use in the process of the above-mentioned invention, having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite, and a kit for the process of the above-mentioned invention. Preferably, the kit comprises the microarray device, bisulfite converting reagents and DNA labeling reagents.

Specifically, the present invention provides:

(1) A process that simultaneously detects methylation at multiple CpG island sites using a reference sample obtained from a sample to be tested, wherein the process is a nucleic acid methylation detection process that uses an internal reference sample and comprises the steps of:

using a DNA sample for analysis, that is divided into a first DNA sample to be tested and a second DNA sample to be the internal reference, to amplify the second DNA sample such that methylcytosine residues are amplified as unmethylated cytosine residues;

converting the unmethylated cytosine residues to deoxyuracil residues in both the first DNA sample and the second DNA sample;

using a first fluorescent marker and a second fluorescent marker having non-overlapping fluorescent excitation and fluorescent emission spectra to label the first DNA sample with the first fluorescent marker and to label the second DNA sample with the second fluorescent marker; and

hybridizing the first DNA sample and the second DNA sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted forms;

(2) A process that simultaneously detects methylation at a large number of CpG island sites using a reference sample obtained from a sample to be tested, comprising:

(a) providing a DNA sample for analysis;

(b) dividing the DNA sample into a first DNA sample and a second DNA sample, whereby the first sample will become a test sample and the second sample will become an internal reference sample;

(c) amplifying the second DNA sample by a nucleic acid amplification process such that methylcytosine residues are amplified as unmethylated cytosine residues;

(d) bisulfite conversion of unmethylated cytosine residues into deoxyuracil residues in both the first DNA sample and the amplified second DNA sample;

(e) amplifying the converted first DNA sample and the converted second DNA sample;

(f) labeling the bisulfite-converted second DNA sample with a second fluorescent marker and the bisulfite-converted first DNA sample with a first fluorescent marker, wherein the first and second fluorescent markers have non-overlapping fluorescent excitation and emission spectra; and

(g) hybridizing the first DNA sample and the second DNA sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite;

(3) The process of (1) or (2), wherein the amplification technique employed is PCR (polymerase chain reaction);

(4) The process of any one of (1) to (3), wherein the hybridization conditions are highly stringent conditions;

(5) The process of any one of (1) to (4), wherein the non-overlapping fluorescent labels are Cy3 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindocarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) and Cy5 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindodicarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester);

(6) A microarray plate for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, on which plate the following oligonucleotides are immobilized:

(a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and

(b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines;

(7) A kit for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, which comprises:

(a) the microarray plate of (6),

(b) reagents for bisulfite-conversion and/or DNA labeling reagents;

(8) A kit for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, which comprises:

(a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and

(b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines.

Furthermore, oligonucleotides of the present invention include polynucleotides.

The present invention essentially provides a process wherein the methylation state of cytosine residues within CpG islands is determined by analyzing the signal intensities at defined positions on a microarray device. On a microarray device, each position or site comprises a different capture probe oligonucleotide sequence. Therefore, multiple target molecules can be captured in a multiplex fashion, limited only by the number of capture probe sites available on a microarray device. Those microarray devices developed by CombiMatrix Corporation and marketed through Roche Diagnostics (matriXarray™), for example, can contain up to about 13,000 different sites, providing an ability to develop a single assay on one chip to evaluate methylation at over 13,000 CpG islands simultaneously.

A minimum of two positions is required on a microarray device to identify the methylation state of each cytosine residue. For example, a sample containing a methylated (M) cytosine residue generates a target molecule that contains a cytosine residue at a specific position, whereas a sample containing an unmethylated (U) cytosine residue generates a target molecule that contains a uracil residue at the specific position. For example:

5′--------------c------------3′ target molecule from methylated (M) sample

5′--------------u------------3′ target molecule from unmethylated (U) sample

Each of these target molecules is captured at a different position on the microarray device using a capture oligonucleotide probe with a complementary sequence. High stringency conditions during hybridization and wash steps permit a specific capture of a perfectly matched molecule with a specific capture probe, with no capture or minimal capture of molecules that contain a single-base mismatch between the target molecule and oligonucleotide capture probe.

For example:

5′---------------g--------------3′ capture oligonucleotide probe for methylated (M) sample

5′---------------a--------------3′ capture oligonucleotide probe for unmethylated (U) sample

A methylated (M) sample generates a target molecule containing a cytosine residue at the original cytosine position. Its complementary oligonucleotide capture probe on the microarray device, which contains a guanosine residue at the corresponding site, captures this target. An unmethylated (U) sample generates a target molecule containing uracil at the original cytosine position. Its complementary oligonucleotide capture probe on the microarray device, which contains an adenosine residue at the corresponding site, captures this target.
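The pairing of M and U capture probes described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the fragment is given as the sense strand in DNA letters, uracil is written as T, and the function names (`bisulfite_convert`, `capture_probes`) and the toy fragment are hypothetical rather than taken from this disclosure.

```python
# Hypothetical helper names; a toy model of probe derivation, not a design tool.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def bisulfite_convert(seq, protected=()):
    """Change every C to T except at the protected (methylated) 0-based indices."""
    protected = set(protected)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


def capture_probes(fragment, tested_sites):
    """Return (M probe, U probe) for the cytosine sites being tested.

    M probe: complementary to the target in which only the tested
             cytosines survive conversion (G opposite the tested C).
    U probe: complementary to the fully converted target (A opposite it).
    """
    for i in tested_sites:
        if fragment[i].upper() != "C":
            raise ValueError(f"position {i} is not a cytosine")
    m_target = bisulfite_convert(fragment, protected=tested_sites)
    u_target = bisulfite_convert(fragment)
    return reverse_complement(m_target), reverse_complement(u_target)


m_probe, u_probe = capture_probes("ATCGGCTA", [2])
print(m_probe)  # TAACCGAT
print(u_probe)  # TAACCAAT
```

The two probes differ at a single base (G versus A), which is why high stringency hybridization is needed to discriminate between them.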

In a preferred embodiment of microarray hybridization assays, the target molecules/samples are labeled with a fluorescent dye to produce a fluorescent signal that is detected by an optical detection instrument. Alternative means for detection of binding or hybridization include various electrochemical detection schemes wherein the bound target molecule/oligonucleotide capture probe complex generates an electrical signal or an electric charge detectable by a nearby electrode. Thus, the methylation state of an unknown sample (test sample) can be determined by measuring binding/hybridization (e.g., the fluorescent signal) at the methylated position (M) or unmethylated position (U) at different known locations on the microarray device.

A single microarray device can contain tens, hundreds or thousands of sites, each with a different capture probe oligonucleotide sequence molecule. The methylation state of tens, hundreds or thousands of CpG islands can be determined on a single microarray at one time. However, since determination of the methylation state of hundreds or thousands of CpG island positions is performed at one time, non-specific or artifact events may interfere with robust determination of methylation state at each relevant position. The present inventive process significantly reduces or completely eliminates the probability of obtaining false positives. This is achieved by incorporating an internal reference sample into the assay.

A reference sample, also known as a control sample, is a nucleic acid sample whose methylation state is known. Existing protocols for multiplex determination of methylation state on arrays require the availability of a separate sample for use as a control or reference (Huang et al., Hum. Mol. Genetics 8:459-70, 1999; Balog et al., Anal Biochem. 309:301-310, 2002). The reference sample is obtained from normal tissue adjacent to a tumor tissue, for example. The methylation state of the reference sample is independently determined before it can be used as a reference or control. Alternatively, a reference sample can be produced by chemical synthesis of DNA representing the region being studied. In another protocol for multiplex determination of methylation state a reference or control sample is not used (Adorjan et al., Nucleic Acid Res. 30(5):e21). In this case, the signal intensities from the methylated probe sequence are compared to the signal intensity of the unmethylated probe sequence. However, non-specific or artifact events may interfere with robust ratio determination of methylation state at each relevant position. In a preferred embodiment of the present invention a DNA sample is analyzed in one fluorescent channel while the same DNA sample is used as a reference sample in another fluorescent channel. The reference sample is prepared from the original sample such that any methylation in the original sample is removed to produce a reference sample that is used as an internal negative control. The only sample required for this embodiment is the DNA sample being tested. No other DNA is required, such as synthetically generated reference DNA or DNA from non-cancerous tissue.

For example, an unknown sample may contain both methylated and unmethylated cytosine residues within CpG islands. When the unknown sample is used as a template in a polymerase chain reaction (PCR), the resultant amplicon contains only unmethylated cytosine. This is because amplification of the template incorporates unmethylated dCTP that is mixed into the polymerase reaction. The product of this reaction is used as a negative internal control in any hybridization assay.
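A toy model may help make this concrete. The sketch below, offered under simplifying assumptions, represents a DNA molecule as a sequence plus a set of methylated positions; the PCR step copies the sequence but drops the methyl marks, so bisulfite treatment afterwards converts every cytosine. The class and function names are hypothetical, and uracil is written as T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dna:
    """Toy molecule: a sequence plus the 0-based indices of methylated cytosines."""
    seq: str
    methylated: frozenset = frozenset()


def pcr_copy(template):
    """Model PCR: the copy is built from unmethylated dCTP, so no marks survive."""
    return Dna(template.seq, frozenset())


def bisulfite(dna):
    """Model bisulfite treatment: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in dna.methylated else base
        for i, base in enumerate(dna.seq)
    )


def prepare_samples(sample):
    """Return (test, reference) sequences derived from one starting sample."""
    test = bisulfite(sample)                 # methylated C is retained
    reference = bisulfite(pcr_copy(sample))  # every C is converted
    return test, reference


# Cytosine at index 1 is methylated; cytosine at index 4 is not.
sample = Dna("ACGTCGA", frozenset({1}))
test, reference = prepare_samples(sample)
print(test)       # ACGTTGA
print(reference)  # ATGTTGA
```

The test and reference sequences differ only at the originally methylated position, which is the difference the M and U capture probes are meant to resolve.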

An unknown sample (test) and a known sample (reference) are mixed together and allowed to hybridize to the microarray device. Since both test and reference samples may hybridize to the same capture probe sequences at a particular site on the microarray device, it would otherwise be impossible to determine how much of the signal originated from the test sample and how much of the signal originated from the reference sample, irrespective of the choice of hybridization detection means employed. According to the present inventive process, the test and reference samples are labeled with two different fluorescent dyes so that the signal from each source can be measured separately using wavelength-specific detection of fluorescence. The assay is therefore a 2-color assay using two different fluorescent colors. In this manner, the signal intensity of the test sample is measured by detection in a reader channel that looks for the first fluorescent dye, and the signal intensity of the reference sample is measured by detection in a reader channel that looks for the second fluorescent dye, wherein the first and second fluorescent dyes do not have overlapping emission and excitation wavelengths.
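One way the two channel readings at a methylated-position (M) probe might be combined is sketched below. The text does not specify a calling rule, so the ratio threshold, the background floor and all names here are illustrative assumptions only; a real assay would need empirically determined cut-offs.

```python
from dataclasses import dataclass


@dataclass
class SpotReading:
    """Intensities read at one capture-probe site, in arbitrary units."""
    test: float       # channel of the first dye (test sample)
    reference: float  # channel of the second dye (internal reference)


def call_methylated(m_spot, min_ratio=2.0, floor=1.0):
    """Call a site methylated when the test signal at the M probe clearly
    exceeds the reference signal at the same probe.

    The reference carries no methylation, so whatever it contributes at the
    M probe is treated as non-specific background. min_ratio and floor are
    assumed values, not taken from the text.
    """
    background = max(m_spot.reference, floor)  # avoid dividing by ~0
    return m_spot.test / background >= min_ratio


print(call_methylated(SpotReading(test=5000.0, reference=400.0)))  # True
print(call_methylated(SpotReading(test=450.0, reference=400.0)))   # False
```

Because both readings come from the same spot, probe-specific artifacts such as cross-hybridization affect numerator and denominator alike, which is the practical benefit of the internal reference.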

The reference samples can be prepared in a number of different ways for presentation to a microarray device to measure hybridization. For example, a starting material for this methylation assay is genomic DNA. This material is isolated and purified from tissues or cells using a number of existing methods. Purified genomic DNA is prepared for microarray hybridization following the scheme shown in FIG. 2. Roughly equal amounts of genomic DNA are placed in two separate tubes, one for reference sample preparation and one for test sample preparation.

The reference sample is prepared by an initial PCR step (PCR1) using forward (F₁) and reverse (R₁) primers that are designed to anneal to the template DNA at a position outside of the CpG island being tested to form an amplicon. The amplicon that is produced can have a length of approximately 50 base pairs or 500 base pairs to even 1000 base pairs. The amplicon is purified and treated with sodium bisulfite using standard protocols. The treatment converts cytosine residues to deoxyuracil residues since cytosine residues in the reference sample are unmethylated (U) after the first PCR step (PCR1). Deoxyuracil residues behave as thymidine residues in subsequent enzymatic and annealing reactions. One method for the sodium bisulfite conversion step for a methylation assay follows a procedure described in Frommer et al., PNAS 89, 1827-1831. One method is to:

(1) Dilute DNA (up to 2 μg) with dH₂O to 50 μl (the amount of DNA to be methylated per reaction should be kept constant);

(2) Add 5.5 μl 2M NaOH;

(3) Incubate at 37° C. for 10 min to denature DNA;

(4) Add 30 μl of 10 mM hydroquinone (prepare by adding 55 mg to 50 mL dH₂O);

(5) Add 520 μl freshly prepared 3M sodium bisulfite (prepared by adding 1.88 g sodium bisulfite to 5 mL dH₂O and adjusting the pH to 5.0 with NaOH);

(6) Mix well, incubate at about 50° C. for 16 hours; and

(7) Clean up DNA (e.g., Qiagen or Promega kit or a reverse phase column, e.g., 3M Empore Disk cartridges 4240). If a reverse phase column is used, (a) Add 450 microliters of 10 mM Triethanolamine, 1 mM EDTA, 0.1M Tris pH 7.7; (b) Wash twice with 750 microliters of 10 mM Triethanolamine, 1 mM EDTA, 0.1M Tris pH 7.7; (c) Elute with 50/50 methanol/water+0.3M NaOH; and (d) Speedvac until dry.

In addition, continue the process by:

(8) Resuspend recovered DNA in 50 μl dH₂O, add 5.5 μl of 3M NaOH, incubate at room temperature for 5 min;

(9) Ethanol precipitate DNA, using a carrier such as glycogen; and

(10) Store the DNA like RNA (i.e., keep cold, minimize freeze thaws, store at −20° C.).
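As a quick arithmetic check on steps (1) through (5), the following Python sketch computes the working concentrations that result from the stated volumes, using the dilution relation C1·V1 = C2·V2. It assumes the volumes are simply additive; the variable names are illustrative, and the hydroquinone molar mass (110.11 g/mol) is supplied here rather than taken from the text.

```python
def final_concentration(stock_conc, stock_vol_ul, total_vol_ul):
    """Solve C1 * V1 = C2 * V2 for C2 (same concentration unit as stock_conc)."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Volumes from steps (1), (2), (4) and (5), in microliters.
DNA_UL, NAOH_UL, HYDROQUINONE_UL, BISULFITE_UL = 50.0, 5.5, 30.0, 520.0

denaturation_vol = DNA_UL + NAOH_UL                                   # 55.5 ul
reaction_vol = denaturation_vol + HYDROQUINONE_UL + BISULFITE_UL      # 605.5 ul

naoh_molar = final_concentration(2.0, NAOH_UL, denaturation_vol)
bisulfite_molar = final_concentration(3.0, BISULFITE_UL, reaction_vol)
hydroquinone_mm = final_concentration(10.0, HYDROQUINONE_UL, reaction_vol)

# Stock check for step (4): 55 mg hydroquinone (assumed 110.11 g/mol) in 50 mL.
# mg / (g/mol) gives mmol; mmol / mL gives mol/L; times 1000 gives mM.
hydroquinone_stock_mm = 55.0 / 110.11 / 50.0 * 1000.0

print(f"NaOH during denaturation: {naoh_molar:.3f} M")
print(f"Bisulfite during conversion: {bisulfite_molar:.2f} M")
print(f"Hydroquinone during conversion: {hydroquinone_mm:.2f} mM")
print(f"Hydroquinone stock: {hydroquinone_stock_mm:.2f} mM")
```

Under these assumptions the denaturation is carried out at roughly 0.2 M NaOH, and the 16 hour incubation at roughly 2.6 M bisulfite with about 0.5 mM hydroquinone; the 55 mg in 50 mL recipe does give a stock of approximately 10 mM.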

The converted DNA can now be amplified by PCR. It is important to note that the 2 strands of converted DNA are no longer complementary, so one has to decide which strand (sense or antisense) to amplify. Primers are designed to amplify fully converted DNA (i.e., all C residues are now T). It should be noted that theoretically, the amount of DNA to be methylated per reaction should be kept constant. Moreover, the bisulfite conversion process is not fully efficient; conversion of 60-80% of unmethylated C's is often measured. Further, controls using SssI methylase (NEB) can be generated to estimate conversion efficiencies.
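The loss of strand complementarity after conversion can be illustrated with a minimal in silico sketch. It assumes idealized, complete conversion of every unmethylated cytosine; the function names and the short excerpt of the example region are chosen here for illustration only and are not part of the described process.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def bisulfite_convert(seq, methylated=frozenset()):
    """Idealized bisulfite conversion of one strand (read 5'->3').

    Every cytosine whose 0-based index is not in `methylated` is written
    as 't' (deoxyuracil behaves as thymidine); methylated cytosines stay 'c'.
    """
    return "".join(
        "t" if base == "c" and i not in methylated else base
        for i, base in enumerate(seq)
    )

sense = "ggtagcgggcag"                 # short excerpt spanning one CpG
antisense = reverse_complement(sense)  # opposite strand, written 5'->3'

converted_sense = bisulfite_convert(sense)
converted_antisense = bisulfite_convert(antisense)

# The strands pair perfectly before conversion but not afterwards.
still_complementary = reverse_complement(converted_sense) == converted_antisense
print(converted_sense, converted_antisense, still_complementary)

# With the CpG cytosine (index 5) methylated, that position is protected.
print(bisulfite_convert(sense, methylated={5}))
```

Because each strand is converted independently, primers have to be designed against one converted strand only, which is the choice the paragraph above refers to.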

The converted genomic DNA product is used as a template in a second PCR step (PCR2) using forward (F₂) and reverse (R₂) primers. Since the sodium bisulfite conversion (SBC) step generated thymidine residues at every cytosine position, F₂ and R₂ primers are designed to anneal to a template that contains thymidine at every cytosine position. The forward primer also contains an RNA polymerase promoter sequence at the 5′ end for T7 polymerase. The purified amplicon from PCR2 is used in an in vitro transcription reaction to generate single-stranded RNA molecules suitable for hybridization to the microarray. Fluorescent dyes, such as Cy3 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindocarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) or Cy5 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindodicarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) (Amersham) are incorporated into the product during transcription. In this example, Cy3 UTP (Amersham Cat# PA53026) is used to fluorescently label the reference target RNA. In the reference sample, all cytosine residues, including methylated (M) and unmethylated (U) cytosine, convert to thymidine during bisulfite conversion. The Cy3 labeled reference target RNA, therefore, contains uracil residues at every cytosine position in the original starting material.

The test sample does not undergo an initial PCR step as does the reference sample. Instead, the test sample DNA is treated directly with sodium bisulfite to convert unmethylated (U) cytosine residues to thymidine. Methylated (M) cytosine residues are not converted and retain their cytosine structure. The resulting product is used as a template in a PCR step (PCR2) using forward (F₂) and reverse (R₂) primers. The primers are designed to anneal to the template DNA at a position outside of the CpG island being tested. The amplicon that is produced can have a length of approximately 50 base pairs or 500 base pairs or even 1000 base pairs.

The same PCR primers are used for the reference sample and the test sample in PCR2. In a preferred embodiment, the forward primer contains an RNA polymerase promoter sequence at the 5′ end for T7 polymerase.

Different PCR primer pairs are required for amplification of each region that is being queried. When several CpG islands are in close proximity, the same pair of PCR primers is used to encompass all CpG islands in the amplicon. When CpG islands are not in close proximity, separate PCR primer pairs are used for each CpG island being queried. When PCR amplification is performed for multiple sites, amplification reactions can be done in a multiplex fashion by combining multiple sets of PCR primers into one reaction mixture yielding multiple sets of amplicons from different regions of the DNA template. PCR primers for multiplex reactions are designed by accurately predicting primer hybridization, evaluating template secondary structure, selecting matching primer pairs, and identifying non-specific primer binding sites. Products from many multiplex reactions are combined to generate pools of amplicons for tens, hundreds or even thousands of methylation sites.

The purified amplicon from PCR2 is used in an in vitro transcription reaction to generate single-stranded RNA molecules. The single-stranded RNA molecules are suitable for hybridization to the microarray. Fluorescent dyes, such as Cy3 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindocarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) or Cy5 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindodicarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) are incorporated into the product during transcription. In this example, Cy5 UTP (Amersham Cat# PA55026) is used to fluorescently label the test target RNA. Therefore, in the test sample, all unmethylated (U) cytosine residues convert to thymidine during bisulfite conversion. Methylated (M) cytosine residues do not convert and retain their structure. The first fluorescent dye-labeled single-stranded test target RNA contains uracil residues at every cytosine position in the original starting material except for cytosine positions that were methylated. At those positions, cytosine residues are retained in the test target RNA.

-   5′--------------c------------3′ target molecule from methylated (M) sample
-   5′--------------u------------3′ target molecule from unmethylated (U) sample

The interaction between the target and the oligonucleotide capture probe on the microarray device is able to discriminate between a perfect-match hybrid and single-mismatch hybrid by controlling hybridization conditions. For example, highly stringent hybridization conditions to discriminate between single base pair mismatches would be as follows: a test sample is prepared in 50 μl 6×SSPE, 5× Denhardt's reagent (Sigma), 0.05% Tween 20 and hybridized with a microarray device at 50° C. for 6 hours. The microarray is washed with 300 μl of 6×SSC and 0.05% Tween 20 at 50° C., followed by 300 μl of 2×SSC at 48° C., followed by 300 μl of 1×SSC and finally 300 μl 0.5×SSC both at room temperature.

A perfect-match hybrid generates a fluorescent signal when imaged with a fluorescent microarray optical detection device (such as those manufactured by Axon Instruments, Agilent, Applied Precision and others). The single-mismatch hybrid is thermodynamically unstable and does not form a stable hybrid. Therefore, no fluorescent signal is generated at that position on the microarray device.

A single base difference between a methylated and unmethylated sample is identified on the microarray by the presence or absence of signal at positions containing the complementary sequence to each target molecule in solution. Since the intensity of the fluorescent signal at each position reflects the amount of material in each sample, the state of methylation at each CpG island is determined by measuring the fluorescent signal at a methylation (M) or unmethylation (U) position on the microarray.

The reference sample produces signal at the unmethylated (U) position on the microarray in a second fluorescent probe detection channel. This serves as an internal negative control and increases the reliability of results obtained from this assay. If the signal in the first fluorescent probe detection channel (i.e., test sample) is similar to the signal pattern in the second fluorescent probe channel (i.e., reference sample), the test sample is unmethylated at the cytosine position within the CpG island of interest. Conversely, if the signal in the first fluorescent probe detection channel (i.e., test sample) is different from the signal in the second fluorescent probe channel (i.e., reference sample), the test sample is methylated at the cytosine position within the CpG island of interest.

Microarray Design

In a test sample, all or most unmethylated (U) cytosine residues convert to thymidine during bisulfite conversion. Methylated (M) cytosine residues do not convert and retain their structure. The first fluorescent labeled test target RNA contains uracil residues at every cytosine position in the original starting material except for cytosine positions that were methylated. At those positions, cytosine residues are retained in the test target RNA.

-   5′--------------c------------3′ target molecule from methylated (M) sample
-   5′--------------u------------3′ target molecule from unmethylated (U) sample

The interaction between the target and the capture probe on the microarray is able to discriminate between a perfect-match hybrid and single-mismatch hybrid under appropriate high stringency hybridization conditions. Appropriate high stringency hybridization conditions include, for example, hybridization in 50 μl 6×SSPE, 5× Denhardt's reagent (Sigma), 0.05% Tween 20 at 50° C. for 6 hours, followed by washing with 300 μl of 6×SSC and 0.05% Tween 20 at 50° C., followed by washing with 300 μl of 2×SSC at 48° C., followed by washing with 300 μl of 1×SSC and finally with 300 μl 0.5×SSC both at room temperature.

A more preferred high stringency condition is hybridization at 55° C. in the solution above instead of 50° C. However, in addition to temperature, several factors, such as salt concentration, can influence the stringency of hybridization, and one skilled in the art can suitably select the factors to accomplish a similar stringency.

The perfect-match hybrid on the microarray generates a fluorescent signal when imaged with an optical detection device. Alternatively, other common microarray detection technologies can be used, such as electrochemical detection on microarray devices having electrodes and electronic signal hybridization detection technologies. The single-mismatch hybrid is thermodynamically unstable and does not form a stable hybrid. Therefore, no detectable signal is generated at that position on the microarray device.

The microarray is designed to have at least one capture probe for the methylated target (M) and one capture probe for the unmethylated target (U). Example sequences are shown below:

Methylated Sample Capture Probe Sequence (M)

-   5′-uauuuuuuuagguagcggguaguaguuguuu-3′ target sequence [SEQ ID NO. 1]
-   3′-auaaaaaaauccaucgcccaucaucaacaaa-5′ capture probe sequence (3′-5′) [SEQ ID NO. 2]
-   5′-aaacaacuacuacccgcuaccuaaaaaaaua-3′ capture probe sequence (5′-3′) [SEQ ID NO. 3]

Unmethylated Sample Capture Probe Sequence (U)

-   5′-uauuuuuuuagguaguggguaguaguuguuu-3′ target sequence [SEQ ID NO. 4]
-   3′-auaaaaaaauccaucacccaucaucaacaaa-5′ capture probe sequence (3′-5′) [SEQ ID NO. 5]
-   5′-aaacaacuacuacccacuaccuaaaaaaaua-3′ capture probe sequence (5′-3′) [SEQ ID NO. 6]
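The relationship between each target sequence and its capture probe is a plain reverse complement, which can be checked with a short sketch. The function name `capture_probe` is hypothetical; the sequences are SEQ ID NOs. 1 and 4 as listed above.

```python
RNA_COMPLEMENT = {"a": "u", "u": "a", "c": "g", "g": "c"}

def capture_probe(target):
    """Return the capture probe (5'->3') complementary to an RNA target (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target))

target_m = "uauuuuuuuagguagcggguaguaguuguuu"  # SEQ ID NO. 1 (methylated)
target_u = "uauuuuuuuagguaguggguaguaguuguuu"  # SEQ ID NO. 4 (unmethylated)

probe_m = capture_probe(target_m)  # reproduces SEQ ID NO. 3
probe_u = capture_probe(target_u)  # reproduces SEQ ID NO. 6

# 0-based positions at which the M and U probes differ
mismatches = [i for i, (m, u) in enumerate(zip(probe_m, probe_u)) if m != u]
print(probe_m, probe_u, mismatches)
```

The two 31-mer probes differ at a single position, index 15, which is the center of the probe.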

Multiple methylation assays are performed at the same time on one microarray. Here, methylation assays ‘1 through . . . ’ are performed in parallel. Each methylation assay has its own pair of methylated (M) and unmethylated (U) capture probes on the microarray.

Since the reference sample undergoes an initial PCR step, only unmethylated cytosine is present in the DNA amplicon prior to transcription. The reference sample represents signal from an unmethylated source. The test sample is not treated with an initial PCR step and the methylation state of the cytosine residues is retained prior to sodium bisulfite conversion. The methylation state is determined by comparing the signal intensities of the reference sample in a second fluorescent probe channel to those of the test sample in a first fluorescent probe channel. If there is equal signal intensity in both the first and second channels, the test sample is unmethylated at the specific CpG island of interest. If there is signal intensity in the first fluorescent probe channel but not in the second fluorescent probe channel, the test sample is methylated (see FIG. 4).

FIG. 4 shows results from a multiplexed two-color methylation detection assay performed on a microarray. Multiple methylation sites (CpG islands) are detected at one time on the same microarray (1 to . . . ). The ‘M’ position contains capture probe sequences for a methylated sample. The ‘M’ capture probe contains a guanosine residue at the cytosine position of the original cytosine in the CpG island and adenosine residues at all other cytosine positions. The ‘U’ position contains capture probe sequences for an unmethylated sample. The ‘U’ capture probe contains adenosine residues at all cytosine positions in the original sample.

In the data illustrated in FIG. 4, the second fluorescent probe signals obtained from the reference sample represent the pattern obtained by an unmethylated sample. Very little or no signal is detected at the Methylation probe (M) and a large amount of signal is detected at the Unmethylation probe (U). This pattern establishes the reference signal for an unmethylated sample. The first fluorescent probe signals represent the methylation state of the test sample. The microarray determines the methylation state of hundreds or thousands of CpG islands in a multiplex fashion. The signal at position 1 shows the results from CpG island number 1. The signal at position 2 shows the results from CpG island number 2, and so on. The signal at position 1 displays strong fluorescent signal at the Unmethylation probe (U) and very little or no signal at the Methylation probe (M). This pattern is similar to the signal pattern in the reference sample for position 1. The interpretation is made that CpG island number 1 is unmethylated (U).

Further, in FIG. 4, the signal at position 2 displays strong fluorescent signal at both the Unmethylation probe (U) and the Methylation probe (M). This pattern is different from the signal pattern in the reference sample for position 2. Since both Methylation and Unmethylation probes generate fluorescent signal in the test sample, the interpretation is made that CpG island number 2 has both a methylated and unmethylated allele (M/U). The signal at position 3 displays strong fluorescent signal at the Methylation probe (M) and very little or no signal at the Unmethylation probe (U). This pattern is different from the pattern in the reference sample for position 3. The interpretation is made that CpG island number 3 is methylated (M).

By following the interpretation logic illustrated with the data shown in FIG. 4, the methylation state of hundreds or thousands of CpG islands can be determined on a single microarray at one time. Once the fluorescent signal is obtained, the process of interpretation, or calling the methylation state, can be performed using suitable computers and software algorithms. This permits rapid interpretation of assay results without the need for human intervention.
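One way such a calling algorithm might look is sketched below. This is an illustration only, not the software used with the assay: the function names, the single `background` cutoff, and the intensity values are all hypothetical, and a practical implementation would need normalization and replicate handling.

```python
def call_site(test_m, test_u, background):
    """Call one CpG site from test-channel signals at its M and U features.

    `background` is a hypothetical cutoff at or below which a feature is
    treated as showing very little or no signal.
    """
    m_on = test_m > background
    u_on = test_u > background
    if m_on and u_on:
        return "M/U"      # methylated and unmethylated alleles both present
    if m_on:
        return "M"
    if u_on:
        return "U"
    return "no call"      # neither feature rises above background

def reference_ok(ref_m, ref_u, background):
    """The reference channel should show the unmethylated pattern (U on, M off)."""
    return ref_u > background and ref_m <= background

# Hypothetical intensities: site -> (ref_M, ref_U, test_M, test_U)
signals = {
    1: (40, 4200, 50, 4000),
    2: (35, 4100, 3500, 3800),
    3: (60, 3900, 4100, 60),
}
calls = {
    site: call_site(test_m, test_u, 200) if reference_ok(ref_m, ref_u, 200) else "invalid"
    for site, (ref_m, ref_u, test_m, test_u) in signals.items()
}
print(calls)
```

With these made-up values the three sites follow the three patterns discussed for positions 1 to 3: unmethylated, mixed, and methylated.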

Capture Probe Library Screening

Methylation of cytosine residues generally occurs within CpG dinucleotide positions. Computer algorithms that scan through known regions of the genome predict sites where methylation of cytosine may occur (http://www.uscnorris.com/cpgislands, Takai and Jones, Proc. Natl. Acad. Sci. 19; 99(6):3740-5, 2002). Additionally, computer databases store DNA methylation sites and allow searching and retrieval of DNA sequences around these sites (http://www.methdb.net/).
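As a rough illustration of what such scanning algorithms compute, the sketch below evaluates GC content and the observed/expected CpG ratio for a sequence. The function names are hypothetical, and the default thresholds are assumptions modeled on commonly cited CpG island criteria; they should be checked against the cited reference before use.

```python
def cpg_statistics(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.lower()
    n = len(seq)
    c = seq.count("c")
    g = seq.count("g")
    cpg = seq.count("cg")  # number of CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected ratio: (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

def looks_like_cpg_island(seq, min_length=500, min_gc=0.55, min_obs_exp=0.65):
    """Apply assumed length, GC content, and obs/exp CpG thresholds."""
    gc_fraction, obs_exp = cpg_statistics(seq)
    return len(seq) >= min_length and gc_fraction >= min_gc and obs_exp >= min_obs_exp

print(cpg_statistics("cgcg"))
print(looks_like_cpg_island("cg" * 250))
```

A genome-wide scan would apply the same statistics in a sliding window rather than to one fixed sequence.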

The microarrays are composed of hundreds or thousands of different sequence oligonucleotide capture probes to specifically capture target molecules. The target molecules contain either a cytosine residue at the original methylation site if the sample is methylated, or a uracil residue if the sample is unmethylated.

-   5′--------------c------------3′ target molecule from methylated (M) sample
-   3′--------------g------------5′ capture probe for methylated (M) sample
-   5′--------------u------------3′ target molecule from unmethylated (U) sample
-   3′--------------a------------5′ capture probe for unmethylated (U) sample

Specific hybridization of the target molecule to its complementary capture probe is required for robust assay performance. Hybridization conditions are designed so that a single base mismatch between a capture probe and target molecule does not form a stable hybrid. For example, hybridization is performed in 50 μl 6×SSPE, 5× Denhardt's reagent (Sigma), 0.05% Tween 20 at 50° C. for 6 hours. The array is washed with 300 μl of 6×SSC and 0.05% Tween 20 at 50° C., followed by 300 μl of 2×SSC at 48° C., followed by 300 μl of 1×SSC and finally 300 μl 0.5×SSC both at room temperature.

It is also highly desirable to screen through a large number of capture probes at one time. High throughput screening of tens or hundreds of different capture probes against the same target permits rapid and cost-effective development of validated probe sets.

A single-base mismatch placed at different positions along a capture probe can have significant impact on the performance of the capture probe and on its ability to discriminate between a perfectly matched and a single base mismatched target. It is desirable to rapidly and effectively screen a library of hundreds or thousands of different capture probes to identify the most reliable sequence. For example, a library of sequences is generated by moving the position of the mismatch sequence (methylation position) along the capture probe sequence. Capture probe for methylated (M) sample:

-   5′--g------------------------3′
-   5′----g----------------------3′
-   5′------g--------------------3′
-   5′--------g------------------3′
-   5′-----------g---------------3′
-   5′-------------g-------------3′
-   5′---------------g-----------3′
-   5′-----------------g---------3′
-   5′-------------------g-------3′
-   5′---------------------g-----3′
-   5′-----------------------g---3′

In addition, a library of sequences is generated by increasing the length of the capture probe sequence. Capture probe for methylated (M) sample:

-   5′-----g-----3′
-   5′------g------3′
-   5′-------g-------3′
-   5′--------g--------3′
-   5′---------g---------3′
-   5′----------g----------3′
-   5′-----------g-----------3′
-   5′------------g------------3′
-   5′-------------g-------------3′

By combining the probe length and mismatch position, hundreds or even thousands of different probes are designed for a single CpG island position. The semiconductor based microarray system rapidly synthesizes all probes at one time and the entire library is empirically tested by a screening assay. Only probes with desired performance are selected and used in the final assay.
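The combination of probe length and mismatch position can be enumerated programmatically. The following is a minimal sketch under stated assumptions: the function name, the `margin` parameter, and the choice of lengths are hypothetical, and the region used is the 31-mer methylated capture probe of SEQ ID NO. 3 with its discriminating guanosine at 0-based index 15.

```python
def probe_library(region, site, lengths, margin=2):
    """Enumerate candidate probes as windows of `region` that contain `site`.

    region  -- probe-strand sequence spanning the CpG position (5'->3')
    site    -- 0-based index of the discriminating base within `region`
    lengths -- iterable of probe lengths to try
    margin  -- minimum number of bases kept on each side of the site
    Returns a list of (probe_sequence, length, offset_of_site_in_probe).
    """
    library = []
    for length in lengths:
        for offset in range(margin, length - margin):
            start = site - offset
            end = start + length
            if start < 0 or end > len(region):
                continue  # window would run off the available sequence
            library.append((region[start:end], length, offset))
    return library

region = "aaacaacuacuacccgcuaccuaaaaaaaua"  # SEQ ID NO. 3
library = probe_library(region, site=15, lengths=range(11, 28, 2))

# Every candidate keeps the discriminating 'g' at its recorded offset.
assert all(probe[offset] == "g" for probe, _, offset in library)
print(len(library), library[0])
```

Each tuple records where the discriminating base sits, so screening results can later be grouped by probe length and mismatch position.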

The microarray device of the present invention refers to a device in which oligonucleotides and such are array-immobilized on a plate, and normally refers to those having nucleotides placed on a plate surface such as glass and silicon. The microarray of the present invention is not limited to so-called spotted microarrays. High-density arrays constructed by synthesizing various polynucleotides at once on plates are also called “DNA chips”. Such “chips” on which oligonucleotides are synthesized on plates are also included in the microarray device of the present invention.

In the present invention, the term “plate” refers to a material in the form of a sheet on which nucleotides can be immobilized. In some cases, the microarray device itself is referred to as a plate. Namely, a plate with immobilized oligonucleotides is simply referred to as a “plate”. There are no particular limitations on the plate of the present invention as long as nucleotides can be immobilized on it, and plates (for example, glass and silicon) generally used for microarray technology can be preferably used.

In general, a microarray comprises thousands of polynucleotides spotted onto a plate at high density (the step of immobilizing polynucleotides onto a plate is also called “printing”). Normally, these nucleotides are spotted (printed) onto the surface of a non-porous plate. The plate generally has a glass surface, but a porous membrane such as nitrocellulose membrane may also be used. In a polynucleotide array, polynucleotides can be synthesized in situ. For example, in situ synthesis methods, such as photolithographic technology (Affymetrix) and ink jet technology for immobilizing chemical substances (Rosetta Inpharmatics), are already known, and either technology can be used for constructing the plates of the present invention. In the present invention, “immobilization” onto a plate includes the meaning of the so-called “synthesis”. Those skilled in the art can usually use a commercially available device that allows high-density spotting (printing) to construct, for example, microarrays comprising 10,000 or more kinds of spots (prints) on a slide glass as necessary in the laboratory.

In the present invention, polynucleotides can be immobilized onto a plate after they are artificially synthesized. In this case, the polynucleotides can be synthesized by standard methods well known in the art, for example by using a commercially available automatic DNA synthesizer.

A preferred embodiment of the plate of this invention is a microarray plate (device) for use in the detection process of this invention, which has a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of DNA samples as bisulfite-converted and non-converted forms.

Furthermore, in another embodiment, the plate of this invention is a microarray plate for detecting methylation of cytosine sites in CpG islands in the DNA sample to be tested, on which plate the following oligonucleotides are immobilized:

(a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and

(b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines.

The oligonucleotides of (a) and (b) are used as capture probes for methylated and unmethylated samples, respectively, in the process of this invention.

Furthermore, the various capture probes described in the present invention can be used as oligonucleotides to be immobilized onto the plates of this invention.

Furthermore, the present invention provides a kit to be used for the detection process of this invention. More specifically, the present invention provides a kit for detecting methylation of cytosine sites in CpG islands in a DNA sample to be tested. A preferred embodiment of the kit of this invention is, for example, a kit comprising at least one of (a) and (b) described below:

(a) the microarray plate of the present invention,

(b) reagents for bisulfite-conversion and/or DNA labeling reagents.

An example of the DNA labeling reagents is the aforementioned fluorescent labeling substance.

Furthermore, in another embodiment, the kit of the present invention includes a kit comprising at least one of (a) and (b) described below:

(a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and

(b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines.

One skilled in the art can use the oligonucleotides contained in the kit to produce a microarray plate of this invention using conventional methods. Plates thus produced are also included in the present invention.

PCR primers used in the method of this invention, positive and negative standard samples (control samples), instructions indicating the method of use of the kit, and such, may be packaged with the kit of this invention.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a schematic of various prior methylation detection processes, some of which are described in the Background section. The present invention can also be added to this scheme as bisulfite based.

FIG. 2 shows a schematic of a preferred embodiment of the inventive process beginning with DNA (A) and being divided into two arms. The left arm first PCR amplifies the sample (B), then performs bisulfite conversion and then labeling with a second fluorescent probe (Cy3). The right arm first bisulfite converts the sample (C), then PCR amplifies, then labels with a first fluorescent probe (Cy5) and finally both samples (B+C) are hybridized onto a DNA microarray device for detection in a two-color fluorescent imaging reaction.

FIG. 3 shows a microarray device layout pattern for studying a large set of CpG island methylation sites in parallel. A minimum of two features or microarray sites is required on the microarray device for each methylation site that is queried. One feature of the microarray device probes for the presence of methylated cytosine and a second feature of the microarray device probes for the presence of unmethylated cytosine in a sample.

FIG. 4 shows results from a multiplexed two-color methylation detection assay performed on a microarray. Multiple methylation sites (CpG islands) are detected at one time on the same microarray. The ‘M’ position contains capture probe sequences for a methylated sample. The ‘M’ capture probe contains a guanosine residue at the cytosine position of the original cytosine in the CpG island and adenosine residues at all other cytosine positions. The ‘U’ position contains capture probe sequences for an unmethylated sample. The ‘U’ capture probe contains adenosine residues at all cytosine positions in the original sample.

FIG. 5 shows hybridization discrimination between a perfectly-matched 15mer DNA target and single-mismatch 15mer DNA target hybridized under high stringency conditions. Capture probes were designed such that the single mismatch position shifted from the fifth position of the capture probe to the eleventh position. Spot intensity ratio between match and mismatch samples indicates that maximal discrimination was obtained when the mismatch was positioned at the center of the capture probe.

FIG. 6 shows preparation of a reference sample for the sequence region being studied. The sample being tested serves as its own internal reference control according to the inventive process. Amplification of DNA by the first PCR step strips all methylation information from the sample. During PCR1, methylated and unmethylated cytosine residues produce unmethylated cytosine residues in the amplicon. The amplicon undergoes bisulfite conversion and cytosine residues are converted to deoxyuracil, which behaves as thymidine in further enzymatic and annealing reactions. The bisulfite-converted product undergoes a second PCR step to add an upstream transcriptional promoter. The final Cy3-labeled transcript is used as a reference sample that generates the unmethylated signal pattern on the microarray.

FIG. 7 shows preparation of a test sample for the region being studied. The sample is treated with sodium bisulfite to convert unmethylated cytosine residues to deoxyuracil, which behaves as thymidine in further enzymatic and annealing reactions. Methylated cytosine residues are protected during the conversion and retain their original cytosine structure. The bisulfite-converted products undergo a PCR step to add an upstream transcriptional promoter. The final Cy5-labeled transcript is used as the test sample to generate methylated or unmethylated signals on the microarray that reflect the original methylation state of the sample being tested.

BEST MODE FOR CARRYING OUT THE INVENTION

Example 1

This example illustrates a multiplex methylation assay using the inventive procedure and a microarray device from CombiMatrix Corp. (made using in situ synthesis with an electrochemical process).

Sample Collection and DNA Purification.

Homo sapiens DNA mismatch repair (hMLH1) gene (GenBank ACCESSION U83845), a human primary colon carcinoma cell line SW480 (purchased from ATCC; http://www.atcc.org/), and 293T (purchased from ATCC) were used. DNA was purified as follows. The cells were cultivated and collected from dishes. The collected cells were washed in PBS three times. The washed cells were centrifuged and stored at −80° C. Cell aggregates were suspended in reaction buffer and digested with proteinase K. After digestion, the aqueous phases were extracted with a 1:1 mixture of equilibrated phenol and chloroform. DNA was recovered by 70% ethanol precipitation and suspended in pure water or TE (1.0 mg/mL).

Sequence being investigated [SEQ ID NO. 7]:

5′-atcacctcagcagaggcacacaagcccggttccggcatctctgctcctattggctggatatttcgtattccccgagctcctaaaaacgaaccaataggaagagcggacagcgatctctaacgcgcaagcgcatatccttctaggtagcgggcagtagccgcttcagggagggacgaagagacccagcaacccacagagttgagaaatttgactggcattcaagctgtccaatcaatagctgccgctgaagggtggggctggatggcttaagctacagctgaaggaagaacgtgagcacgaggcactgaggtgattggctgaaggcacttccgttgagcatctagacgtttccttggctcttctggcgccaaaatg-3′

Genomic DNA (100 ng) was aliquoted into two tubes, one labeled reference sample and the other labeled test sample.

Reference Sample Preparation

The reference sample undergoes a PCR amplification with the following two primers:

-   Forward primer 1 (F1): 5′-atcacctcagcagaggcacac-3′ [SEQ ID NO. 8]
-   Reverse primer 1 (R1): 5′-tttggcgccagaagagccaag-3′ [SEQ ID NO. 9]

PCR amplification was performed in a total volume of 50 μl containing 1×PCR Gold® buffer (Applied Biosystems, Foster City, Calif.), 2 mM MgCl₂, 0.2 mM deoxynucleotide triphosphates mixture (USB), 10 pmol forward primer (F1, SEQ ID NO. 8), 10 pmol reverse primer (R1, SEQ ID NO. 9), 2 U Amplitaq Gold DNA® polymerase (ABI), and 100 ng template DNA. Reaction conditions were as follows: 95° C. for 10 minutes, and 39 cycles of 92° C. for 30 seconds, 57° C. for 30 seconds, and 70° C. for 30 seconds, with a final elongation for 7 minutes at 70° C. The PCR products were analyzed by gel electrophoresis using a 2.5% agarose gel, stained with ethidium bromide and visualized under UV illumination with a digital imaging system (NucleoTech). Amplicons were purified using QIAquick® PCR purification kits (Qiagen) following the manufacturer's protocol.
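The cycling program and primer amounts above can be written down as data for a quick arithmetic check. This is a sketch for illustration: the dictionary layout and function name are hypothetical, and the computed time counts programmed hold times only, ignoring instrument ramp times.

```python
pcr_program = {
    "initial_denaturation": (95, 10 * 60),           # (deg C, seconds)
    "cycles": 39,
    "cycle_steps": [(92, 30), (57, 30), (70, 30)],   # denature, anneal, extend
    "final_elongation": (70, 7 * 60),
}

def hold_time_seconds(program):
    """Sum of all programmed hold times, excluding ramping between steps."""
    per_cycle = sum(seconds for _, seconds in program["cycle_steps"])
    return (program["initial_denaturation"][1]
            + program["cycles"] * per_cycle
            + program["final_elongation"][1])

total_minutes = hold_time_seconds(pcr_program) / 60

# 10 pmol of primer in a 50 ul reaction: pmol per ul is numerically uM.
primer_micromolar = 10 / 50

print(total_minutes, primer_micromolar)
```

The program holds for 75.5 minutes in total, and each primer is present at 0.2 μM in the reaction.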

Sodium Bisulfite Conversion (SBC)

One microgram of purified amplicon was diluted in 50 μl of distilled water and denatured by addition of 1 μl of 10 M sodium hydroxide to a final concentration of 0.2 M and incubated for 10 minutes at 37° C. After incubation, 30 μl of 10 mM hydroquinone (Sigma) and 520 μl of 3M sodium bisulfite (Sigma) at pH 5.0 were added. The solution was incubated at 53° C. for 18-20 hours.
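The stated concentrations can be checked with simple dilution arithmetic. The helper name below is hypothetical, and the sketch assumes the volumes are additive.

```python
def diluted_concentration(stock_conc, added_ul, total_ul):
    """Concentration after `added_ul` of stock is brought to `total_ul` total volume."""
    return stock_conc * added_ul / total_ul

# Denaturation: 1 ul of 10 M NaOH added to 50 ul of DNA solution.
denaturation_ul = 50 + 1
naoh_molar = diluted_concentration(10.0, 1, denaturation_ul)

# Conversion: 30 ul hydroquinone and 520 ul bisulfite are then added.
conversion_ul = denaturation_ul + 30 + 520
bisulfite_molar = diluted_concentration(3.0, 520, conversion_ul)
hydroquinone_millimolar = diluted_concentration(10.0, 30, conversion_ul)

print(round(naoh_molar, 3), round(bisulfite_molar, 2), round(hydroquinone_millimolar, 2))
```

The sodium hydroxide works out to about 0.196 M, consistent with the stated 0.2 M, and the final reaction contains roughly 2.6 M bisulfite and 0.5 mM hydroquinone in 601 μl.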

DNA was purified by QIAquick® purification kits (Qiagen) following the manufacturer's protocol. The DNA was desulfonated with 0.3 M sodium hydroxide for 10 minutes at room temperature, neutralized with 17 μl of 10 M ammonium acetate (Ambion) and then precipitated in 100% ethanol at −80° C. overnight.

The second PCR amplification of the reference sample used the following primers:

-   Forward primer 2 (F₂): 5′-taatacgactcactatagggattattttagtagaggtatat-3′ [SEQ ID NO. 10]
-   Reverse primer 2 (R₂): 5′-tttggtgttagaagagttaag-3′ [SEQ ID NO. 11]
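The listed PCR2 primers can be reproduced from the PCR1 primers by writing every cytosine as thymine and, for the forward primer, prepending a T7 promoter sequence. The sketch below only mirrors the sequences as listed in this example; the function name is hypothetical, and in general primer design depends on which converted strand is to be amplified.

```python
T7_PROMOTER = "taatacgactcactataggg"  # 5' extension present in the F2 primer

def convert_primer(primer):
    """Write every cytosine of a primer as thymine (idealized full conversion)."""
    return primer.replace("c", "t")

f1 = "atcacctcagcagaggcacac"  # SEQ ID NO. 8
r1 = "tttggcgccagaagagccaag"  # SEQ ID NO. 9

f2 = T7_PROMOTER + convert_primer(f1)  # reproduces SEQ ID NO. 10
r2 = convert_primer(r1)                # reproduces SEQ ID NO. 11

print(f2)
print(r2)
```

The promoter extension on the forward primer is what allows the PCR2 amplicon to serve directly as a template for in vitro transcription.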

Amplification was performed in a total volume of 50 μl containing 1×PCR Gold® buffer (Applied Biosystems, Foster City, Calif.), 2 mM MgCl₂, 0.2 mM deoxynucleotide triphosphates mixture (USB), 10 pmol forward primer 2 (F₂, SEQ ID NO. 10), 10 pmol reverse primer 2 (R₂, SEQ ID NO. 11), 2 U Amplitaq Gold DNA® polymerase (Applied Biosystems, Foster City, Calif.), and 100 ng template DNA. Reaction conditions were as follows: 95° C. for 10 minutes, and 39 cycles of 92° C. for 30 seconds, 57° C. for 30 seconds, and 70° C. for 30 seconds, with a final elongation for 7 minutes at 70° C. The PCR products were analyzed by gel electrophoresis using a 2.5% agarose gel, stained with ethidium bromide and visualized under UV illumination with a digital imaging system (NucleoTech). Amplicons were purified using QIAquick® PCR purification kits (Qiagen) following the manufacturer's protocol.

Transcription

One microgram of purified amplicon containing the T7 promoter sequence was transcribed in vitro in a total volume of 20 μl using MEGAscript® Kits (Ambion) following the manufacturer's protocol with the addition of 5 μl of 10 mM Cy3 UTP (Amersham). The transcripts were purified using RNeasy® purification kits (Qiagen) following the manufacturer's protocol.

Test Sample Preparation

The test sample is first converted in a sodium bisulfite conversion(SBC) step. Briefly, one microgram genomic DNA was diluted in 50 μl ofdistilled water and denatured by addition of 1 μl 10 M sodium hydroxideto a final concentration of 0.2 M and incubated for 10 minutes at 37° C.After incubation, 30 μl 10 mM hydorquinone (Sigma) and 520 μl of 3Msodium bisulfite (Sigma) at pH 5.0 were added. The solution wasincubated at 53° C. for 18-20 hours.

The DNA was desulfonated with 0.3 M sodium hydroxide for 10 minutes at room temperature, neutralized with 17 μl of 10 M ammonium acetate (Ambion) and then precipitated in 100% ethanol at −80° C. overnight.
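To make the effect of the conversion step concrete, here is a minimal in silico sketch. It assumes a toy sequence and methylation positions that are not taken from the source. Unmethylated cytosines become uracil, methylated cytosines are left unchanged, and a subsequent PCR copy reads uracil as thymine.

```python
def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Convert unmethylated C to U; C at 0-based indices in `methylated` is kept."""
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("U")
        else:
            out.append(base)
    return "".join(out)


def pcr_copy(seq: str) -> str:
    """After amplification, uracil positions are read as thymine."""
    return seq.replace("U", "T")


toy = "ACGTCCGA"  # hypothetical sequence with CpG sites at indices 1-2 and 5-6
test_sample = bisulfite_convert(toy, methylated=frozenset({1}))
reference = bisulfite_convert(toy)  # methylation erased before conversion

print(test_sample, pcr_copy(test_sample))
print(reference, pcr_copy(reference))
```

In this toy case the test sample and the reference differ only at the methylated CpG, which is the single-base difference that a methylated (M) and unmethylated (U) probe pair is meant to distinguish.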

The test sample genomic DNA was then amplified by a PCR 2 amplification using the following primers:

[SEQ ID NO. 12] Forward primer 2 (F₂): 5′-taatacgactcactatagggattattttagtagaggtatat-3′

[SEQ ID NO. 13] Reverse primer 1 (R₂): 5′-tttggtgttagaagagttaag-3′

PCR amplification was performed in a total volume of 50 μl containing 1× PCR Gold buffer (Applied Biosystems, Foster City, Calif.), 2 mM MgCl₂, 0.2 mM deoxynucleotide triphosphates mixture (USB), 10 pmol forward primer 2 (F₂), 10 pmol reverse primer (R₂), 2 U AmpliTaq Gold DNA polymerase (Applied Biosystems, Foster City, Calif.), and 100 ng template DNA. Reaction conditions were as follows: 95° C. for 10 minutes, and 39 cycles of 92° C. for 30 seconds, 57° C. for 30 seconds, and 70° C. for 30 seconds, with a final elongation for 7 minutes at 70° C. The PCR products were analyzed by gel electrophoresis using a 2.5% agarose gel, stained with ethidium bromide and visualized under UV illumination with a digital imaging system (NucleoTech). Amplicons were purified using QIAquick PCR purification kits (Qiagen) following the manufacturer's protocol.

The test sample was then subjected to transcription. Briefly, one microgram of purified amplicon containing the T7 promoter sequence was transcribed in vitro in a total volume of 20 μl using MEGAscript® Kits (Ambion) following the manufacturer's protocol with the addition of 5 μl of 10 mM Cy5 UTP (Amersham). The transcripts were purified using RNeasy® purification kits (Qiagen) following the manufacturer's protocol.

Hybridization and Wash

The reference transcript (4 μg) was labeled with a second fluorescent dye, preferably Cy3, and combined with the test transcript (4 μg) labeled with a first fluorescent dye, preferably Cy5, in 50 μl of 6× SSPE, 5× Denhardt's reagent (Sigma), 0.05% Tween 20. The mixture was hybridized with a microarray device at 50° C. for 6 hours. The microarray was washed with 300 μl of 6× SSC and 0.05% Tween 20 at 50° C., followed by 300 μl of 2× SSC at 48° C., followed by 300 μl of 1× SSC and finally 300 μl of 0.5× SSC, both at room temperature.

Microarray Device

The microarray is designed to have at least one capture probe for the methylated target (M) and one capture probe for the unmethylated target (U).
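As a hedged illustration of how each capture probe relates to its target, the sketch below reverse-complements the two target sequences of this example (SEQ ID NO. 1 and 4), written in the same lowercase RNA alphabet as the listing, so that the result can be compared with the listed 5′-3′ capture probes (SEQ ID NO. 3 and 6). The function name is an assumption made for illustration.

```python
RNA_COMPLEMENT = str.maketrans("acgu", "ugca")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of a lowercase RNA sequence, written 5'->3'."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


target_m = "uauuuuuuuagguagcggguaguaguuguuu"  # SEQ ID NO. 1 (methylated target)
target_u = "uauuuuuuuagguaguggguaguaguuguuu"  # SEQ ID NO. 4 (unmethylated target)

probe_m = reverse_complement_rna(target_m)
probe_u = reverse_complement_rna(target_u)

# Positions (0-based, 5'->3') at which the two probes differ.
mismatches = [i for i, (a, b) in enumerate(zip(probe_m, probe_u)) if a != b]
print(probe_m)
print(probe_u)
print(mismatches)
```

The two probes differ at a single central base (g versus a), which pairs with the retained cytosine of the methylated target or with the converted uracil of the unmethylated target.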

Methylated Sample Capture Probe Sequence (M)

-   5′-uauuuuuuuagguagcggguaguaguuguuu-3′ target sequence [SEQ ID NO. 1]
-   3′-auaaaaaaauccaucgcccaucaucaacaaa-5′ capture probe sequence (3′-5′) [SEQ ID NO. 2]
-   5′-aaacaacuacuacccgcuaccuaaaaaaaua-3′ capture probe sequence (5′-3′) [SEQ ID NO. 3]

Unmethylated Sample Capture Probe Sequence (U)

-   5′-uauuuuuuuagguaguggguaguaguuguuu-3′ target sequence [SEQ ID NO. 4]
-   3′-auaaaaaaauccaucacccaucaucaacaaa-5′ capture probe sequence (3′-5′) [SEQ ID NO. 5]
-   5′-aaacaacuacuacccacuaccuaaaaaaaua-3′ capture probe sequence (5′-3′) [SEQ ID NO. 6]

Imaging and Data Analysis

After the final wash step, the microarray was imaged using an optical detection instrument having a CCD camera (arrayWoRx Biochip Reader, Applied Precision). Two images were captured from each microarray, corresponding to the emission wavelength of each fluorescent dye. The images were saved on a microcomputer and analyzed following the manufacturer's instructions (softWoRx Tracker, Applied Precision). The fluorescent intensity at each position on the microarray was quantified and saved as a spreadsheet containing probe sequence and position information as well as the fluorescent intensity of each dye.

Intensity data was analyzed, for example, by calculating the ratio of signal intensity between the test sample having the first fluorescent dye and the reference sample having the second fluorescent dye. In this manner, the methylation state of cytosine residues within a CpG island is determined.

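Ratio analysis of data for probe signal and reference signal:

$R_{m} = \frac{\left( \frac{M_{test}}{M_{ref}} \right)}{\left( \frac{U_{test}}{U_{ref}} \right)}$

where $M$ and $U$ denote the signal intensities at the methylated and unmethylated capture probes, and the subscripts $test$ and $ref$ denote the test sample and reference sample channels, respectively.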
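A minimal sketch of this ratio calculation follows. The intensity values are made up for illustration, and the `floor` guard against division by zero is an assumption rather than part of the described method.

```python
def methylation_ratio(m_test, m_ref, u_test, u_ref, floor=1.0):
    """R_m = (M_test / M_ref) / (U_test / U_ref).

    Intensities below `floor` are raised to `floor` so that background-level
    readings cannot cause a division by zero (an illustrative safeguard).
    """
    m_test, m_ref, u_test, u_ref = (
        max(v, floor) for v in (m_test, m_ref, u_test, u_ref)
    )
    return (m_test / m_ref) / (u_test / u_ref)


# Hypothetical intensities (arbitrary units) for one M/U probe pair.
r_same = methylation_ratio(m_test=200, m_ref=200, u_test=4000, u_ref=4000)
r_meth = methylation_ratio(m_test=4000, m_ref=200, u_test=200, u_ref=4000)
print(r_same, r_meth)
```

A test channel that matches the reference gives a ratio of 1, whereas a test signal shifted from the U probe to the M probe gives a ratio far above 1.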

EXAMPLE 2

This example analyzes multiple CpG islands simultaneously using the procedure described in Example 1 for a single CpG island methylation determination. Signal intensities shown in FIG. 4 are analyzed by calculating the ratios for each probe in both the test and reference signal channels. For example, the first row of the microarray shown in FIG. 4 is designed to assess the methylation state of region 1 in a sample. The second row is designed to assess the methylation state in region 2 of the same sample, and so on. Each region in the sample may be methylated (M) or unmethylated (U). The sample is prepared as described in Example 1. The sample being tested is used as its own internal reference control. Preparation of the reference sample removes any methylation that may have been present. The reference target is labeled with Cy3 fluorescent dye, for example, and its signal appears in the Cy3 detection channel. The test sample is processed so that methylation is retained during preparation. The test sample is labeled with Cy5 fluorescent dye, for example, and its signal appears in the Cy5 detection channel. FIG. 4 shows signal intensities appearing in each channel in black.

The results for region 1 are shown in the first row of the microarray. The reference sample appears with fluorescent signal at the unmethylated probe (U) position as expected. The test sample also appears with signal at the unmethylated probe (U) position, similar to the reference sample. By calculating these results using the formula provided in Example 1, the interpretation is made that the sample is unmethylated (U) in region 1.

The results for region 2 are shown in the second row of the microarray. The reference sample appears with fluorescent signal at the unmethylated probe (U) position as expected. The test sample appears with signal at both the unmethylated probe (U) position and the methylated probe (M) position. By calculating these results using the formula provided in Example 1, the interpretation is made that the sample contains both methylated (M) and unmethylated (U) cytosine in region 2 in approximately equal proportions. This occurs when, for example, one allele in the sample is methylated and the other allele is unmethylated. This may also occur in a heterogeneous cell population where approximately one half of the cells are methylated in region 2.

The results for region 3 are shown in the third row of the microarray. The reference sample appears with fluorescent signal at the unmethylated probe (U) position as expected. The test sample appears with signal only at the methylated probe (M) position. By calculating these results using the formula provided in Example 1, the interpretation is made that the sample is entirely methylated (M) in region 3.

Microarrays containing tens, hundreds, or even thousands of positions can be used to determine the methylation state of tens, hundreds, or even thousands of different CpG island regions within the same sample and in parallel.
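The row-by-row interpretation described in this example can be sketched as a loop over regions. The intensities, the background cut-off, and the call rule are all hypothetical assumptions chosen to mirror regions 1 to 3; the source gives no numeric thresholds. The ratio from Example 1 is reported alongside each call.

```python
BACKGROUND = 500  # hypothetical cut-off (arbitrary units) for "signal present"


def ratio(m_test, m_ref, u_test, u_ref):
    """R_m from Example 1: (M_test / M_ref) / (U_test / U_ref)."""
    return (m_test / m_ref) / (u_test / u_ref)


def call_region(m_test, u_test, background=BACKGROUND):
    """Call a region from which test-channel probes show signal."""
    has_m, has_u = m_test >= background, u_test >= background
    if has_m and has_u:
        return "mixed"
    if has_m:
        return "methylated"
    if has_u:
        return "unmethylated"
    return "no call"


# Hypothetical intensities per region: (M_test, M_ref, U_test, U_ref)
regions = {
    1: (100, 100, 5000, 5000),
    2: (2500, 100, 2500, 5000),
    3: (5000, 100, 100, 5000),
}

calls = {}
for region, (m_t, m_r, u_t, u_r) in regions.items():
    calls[region] = call_region(m_t, u_t)
    print(region, calls[region], round(ratio(m_t, m_r, u_t, u_r), 2))
```

The same loop extends unchanged to arrays with many more rows, since each region is evaluated independently of the others.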

1. A process that simultaneously detects methylation at multiple CpG island sites using a reference sample obtained from a sample to be tested, wherein the process is a nucleic acid methylation detection process that uses an internal reference sample and comprises the steps of: using a DNA sample for analysis, that is divided into a first DNA sample to be tested and a second DNA sample to be the internal reference, to amplify the second DNA sample such that methylcytosine residues are amplified as unmethylated cytosine residues; converting the unmethylated cytosine residues to deoxyuracil residues in both the first DNA sample and the second DNA sample; using a first fluorescent marker and a second fluorescent marker having non-overlapping fluorescent excitation and fluorescent emission spectra to label the first DNA sample with the first fluorescent marker and to label the second DNA sample with the second fluorescent marker; and hybridizing the first DNA sample and the second DNA sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted forms.
2. A process that simultaneously detects methylation at a large number of CpG island sites using a reference sample obtained from a sample to be tested, comprising: (a) providing a DNA sample for analysis; (b) dividing the DNA sample into a first DNA sample and a second DNA sample, whereby the first sample will become a test sample and the second sample will become an internal reference sample; (c) amplifying the second DNA sample by a nucleic acid amplification process such that methylcytosine residues are amplified as unmethylated cytosine residues; (d) bisulfite conversion of unmethylated cytosine residues into deoxyuracil residues in both the amplified first DNA sample and the second DNA sample; (e) amplifying the converted first DNA sample and the converted second DNA sample; (f) labeling the bisulfite-converted second DNA sample with a second fluorescent marker and the bisulfite-converted first DNA sample with a first fluorescent marker, wherein the first and second fluorescent markers have non-overlapping fluorescent excitation and emission spectra; and (g) hybridizing the first DNA sample and the second DNA sample onto a microarray device having a plurality of oligonucleotide capture probes designed to hybridize to CpG island sites of the DNA sample as converted and non-converted by bisulfite.
3. The process of claim 1, wherein the amplification technique employed is PCR (polymerase chain reaction).
4. The process of claim 1, wherein the hybridization conditions are highly stringent conditions.
5. The process of claim 1, wherein the non-overlapping fluorescent labels are Cy3 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindocarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester) and Cy5 (1,1′-bis(ε-carboxypentyl)-1′ethyl-3,3,3′,3′-tetramethylindodicarbocyanine-5,5′-disulfonate potassium salt di-N-hydroxysuccinimide ester).
6. A microarray plate for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, on which plate the following oligonucleotides are immobilized: (a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and (b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines.
7. A kit for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, which comprises: (a) the microarray plate of claim 6, and (b) reagents for bisulfite-conversion and/or DNA labeling reagents.
8. A kit for detecting methylation at cytosine sites in CpG islands in a DNA sample to be tested, which comprises: (a) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein cytosine sites other than the cytosine sites to be tested are substituted with thymines; and (b) an oligonucleotide comprising a sequence complementary to a DNA fragment comprising cytosine sites to be tested in the DNA sample, wherein all the cytosine sites are substituted with thymines.